Art Unit: 2654 

DETAILED ACTION 
Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 



1. Claims 1, 4-7, 9, 10 and 12 are rejected under 35 U.S.C. 102(e) as being 
anticipated by Addison et al. (U.S. Patent 6,865,533), hereinafter referred to as 
Addison. 



Regarding claim 1, Addison discloses a text-to-speech system that includes a 
method with the following steps: 

• receiving input text into a text-to-speech synthesizing system (Fig. 1, item 12; Fig. 
2, item 112; abstract, converting text into speech); 

• determining a topic for the input text (Fig. 2, item 114; col. 3, lines 50-63; col. 11, 
lines 58-68; col. 18, lines 20-28; when processing the text, artificial intelligence rules 
determine general informational content [topic]); 

• selecting a speaking style from a plurality of predefined speaking styles based on 
the identified topic, where each speaking style correlates to prosodic parameters and is 
associated with one or more anticipated topics (col. 11, lines 45-67; styles: male, 
female, methodical, etc.; col. 18, lines 20-29; determine the general informational 
context [topic]; col. 24, lines 15-21; a style is determined); and 

• converting the input text to audible speech using the prosodic parameters (Figs. 1-3, 
item 34, speech output). 
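
For illustration only, the style-selection step recited in claim 1 can be sketched as a lookup from anticipated topics to predefined speaking styles, each carrying prosodic parameters. This sketch is not taken from Addison or from the application; the topic names, style names, and parameter values below are all hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakingStyle:
    """A predefined speaking style and its (hypothetical) prosodic parameters."""
    name: str
    rate: float         # relative speaking rate; 1.0 is neutral (assumed scale)
    pitch_shift: float  # pitch offset in semitones from neutral (assumed scale)


# Hypothetical predefined styles.
STYLES = {
    "methodical": SpeakingStyle("methodical", rate=0.85, pitch_shift=-1.0),
    "rapid": SpeakingStyle("rapid", rate=1.2, pitch_shift=1.0),
}

# Hypothetical association of anticipated topics with styles.
TOPIC_TO_STYLE = {"finance": "methodical", "sports": "rapid"}

DEFAULT_STYLE = SpeakingStyle("neutral", rate=1.0, pitch_shift=0.0)


def select_style(topic: str) -> SpeakingStyle:
    """Return the style associated with the topic, or a neutral default."""
    return STYLES.get(TOPIC_TO_STYLE.get(topic, ""), DEFAULT_STYLE)
```

A synthesizer would then read `rate` and `pitch_shift` from the returned style when rendering speech; that rendering step is outside this sketch.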

Regarding claim 4, Addison teaches everything claimed, as applied above (see 
claim 1). In addition, Addison teaches the following: 

• converting the input text to corresponding phoneme data (Fig. 1, item 22; col. 8, 
lines 33-39); 

• applying prosodic parameters to the phoneme data, thereby generating a prosodic 
representation of the phoneme data (Fig. 1, item 28; Fig. 2, items 114, 116, 120, and 
122); and 

• generating audible speech using the prosodic representation of the phoneme data 
(Fig. 1, item 34; Figs. 2 and 3, Speech Output). 

Regarding claim 5, Addison discloses a text-to-speech system that includes a 
method with the following steps: 

• receiving input text (Fig. 1, item 12; Fig. 2, item 112); 

• determining semantic information for the input text (Fig. 2, item 114; col. 3, lines 
50-63; col. 11, lines 58-68; col. 18, lines 20-28; when processing the text, artificial 
intelligence rules determine general informational content); 

• selecting a speaking style from a plurality of predefined speaking styles based on 
the identified topic, where each speaking style correlates to prosodic parameters and is 
associated with one or more anticipated topics (col. 11, lines 45-67; styles: male, 
female, methodical, etc.; col. 18, lines 20-29; determine the general informational 
context [topic]; col. 24, lines 15-21; a style is determined); and 

• customizing an output parameter of a multimedia user interface of the text-to- 
speech synthesizer system based on the speaking style, where the text-to-speech 
synthesizer system is operable to render audible speech which correlates to the input 
text (Figs. 1-3, item 34, speech output system). 

Regarding claim 6, Addison teaches everything claimed, as applied above (see 
claim 5). In addition, Addison teaches "the step of determining semantic information 
further comprises determining a topic for the input text" (Fig. 2, item 114; col. 3, lines 
50-63; col. 11, lines 58-68; col. 18, lines 20-28; when processing the text, artificial 
intelligence rules determine general informational content [topic]). 

Regarding claim 7, Addison teaches everything claimed, as applied above (see 
claim 5). In addition, Addison teaches "the step of determining semantic information 
further comprises partitioning the input text into a plurality of context spaces, and 
determining a topic for each of the plurality of context spaces" (col. 3, line 63 through 
col. 4, line 3). 

Regarding claim 9, Addison teaches everything claimed, as applied above (see 
claim 5). In addition, Addison teaches "the step of customizing an output parameter 
further comprises generating synthesized speech" (Figs. 1-3, item 34, Speech output). 

Regarding claim 10, Addison teaches everything claimed, as applied above (see 
claim 5). In addition, Addison teaches "the step of customizing an output parameter 
further comprises correlating the selected speaking style to one or more prosodic 
parameters and rendering audible speech for the input text using the prosodic 
parameters" (col. 3, lines 50-64). 

Regarding claim 12, Addison discloses a text-to-speech system with the 
following components: 

• a text analyzer receptive of input text and operable to determine semantic 
information for the input text (Fig. 1, item 12; Fig. 2, items 112, 114); 

• a style selector adapted to receive semantic information from the text analyzer and 
operable to determine a speaking style for rendering the input text based on the semantic 
information, where the selected speaking style correlates to one or more prosodic 
attributes (col. 24, lines 15-21; a style is determined; Fig. 2, items 114, 116, 120, 122); 

• a phonetic analyzer adapted to receive input text from the text analyzer and 
operable to convert the input text into corresponding phoneme data (Fig. 1, items 22 
and 26); 

• a prosodic analyzer adapted to receive phoneme data from the phonetic analyzer 
and the prosodic attributes from the style selector, the prosodic analyzer further 
operable to apply the prosodic attributes to the phoneme data to form a prosodic 
representation of the phoneme data (Figs. 1-3, items 26, 28, 116, 120, 122, 142); and 

• a speech synthesizer adapted to receive the prosodic representation of the 
phoneme data from the prosodic analyzer and operable to generate audible speech 
(Figs. 1 and 3, Speech Output). 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

2. Claim 2 is rejected under 35 U.S.C. 103(a) as being unpatentable over Addison 
in view of Apte et al. (U.S. Patent 6,253,169), hereinafter referred to as Apte. 

Regarding claim 2, Addison teaches everything claimed, as applied above (see 
claim 1). In addition, Addison teaches that the text can be analyzed by the artificial 
intelligence unit to determine a topic (col. 11, lines 53-67; col. 18, lines 20-29; where 
the analysis will necessarily involve the words represented in the text), which 
corresponds to "defining a plurality of anticipated topics, such that each anticipated 
topic is associated with keywords that are indicative of the topic." But Addison does 
not specifically teach "determining frequency of the keywords in the input text; and 
selecting a topic for the input text from the plurality of anticipated topics based on the 
frequency of keyword occurrences contained therein." However, the examiner 
contends that these concepts were well known in the art, as taught by Apte. 

In the same field of endeavor, Apte discloses a method for improving the 
accuracy of decision tree based text categorization. Apte's teachings include 
determining the frequency of words [keywords] in a document [text] to classify 
[associate a topic with] that document (col. 1, lines 45-65). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Addison by specifically providing the 
features, as taught by Apte, because they were well known in the art at the time of the 
invention as an effective means of assigning a topic to text. 
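
For illustration only, the keyword-frequency limitation quoted from claim 2 can be sketched as counting how often each anticipated topic's keywords occur in the input text and selecting the topic with the highest count. This is a simple count-and-compare sketch, not Apte's decision-tree categorization and not the applicant's implementation; the topics and keyword sets below are hypothetical assumptions.

```python
import re
from collections import Counter

# Hypothetical anticipated topics, each associated with indicative keywords.
TOPIC_KEYWORDS = {
    "sea": {"ship", "wave", "tide", "harbor"},
    "finance": {"stock", "market", "bond", "price"},
}


def select_topic(text, topic_keywords=TOPIC_KEYWORDS):
    """Return the topic whose keywords occur most often in text, or None."""
    # Count every lowercase alphabetic word in the input text.
    word_counts = Counter(re.findall(r"[a-z]+", text.lower()))
    # Score each topic by the total frequency of its keywords.
    scores = {
        topic: sum(word_counts[keyword] for keyword in keywords)
        for topic, keywords in topic_keywords.items()
    }
    best = max(scores, key=scores.get)
    # If no keyword occurs at all, no topic is selected.
    return best if scores[best] > 0 else None
```

In this sketch a tie is resolved in favor of whichever topic appears first in the dictionary, which is an arbitrary choice made for brevity.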

3. Claim 11 is rejected under 35 U.S.C. 103(a) as being unpatentable over Addison 
in view of Sutton et al. (U.S. Patent 6,539,354), hereinafter referred to as Sutton. 

Regarding claim 11, Addison teaches everything claimed, as applied above (see 
claim 5). But Addison does not specifically teach "the step of customizing an output 
parameter further comprises modifying at least one of an expression of a visually 
displayed talking head and another attribute of a visual display." However, the 
examiner contends that this concept was well known in the art, as taught by Sutton. 

In the same field of endeavor, Sutton discloses methods and devices for 
producing and using synthetic visual speech [facial animations] based on natural 
coarticulation. In addition, Sutton teaches that the animation can support various voice 
characteristics and emotions (Figs. 5A and 6; col. 4, lines 15-20; col. 14, lines 1-17; e.g. 
a character emotion can be specified -smile+jawdown+headright). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to modify Addison by specifically providing the 
animation functionality, as taught by Sutton, because it was well known in the art at the 
time of the invention for the purpose of producing realistic visual lipsyncing (col. 2, line 55 
through col. 3, line 19). 



Response to Arguments 

4. Applicant's arguments filed 8/02/2005 have been fully considered but they are 
not persuasive. 



5. Applicant asserts on page 6: 

Addison is directed generally to a method for converting text into 
synthesized speech. Of interest, the Examiner asserts that Addison 
discloses selecting a speaking style based on an identified topic of the 
input text. The Examiner relies on column 24, lines 15-21 to teach this 
aspect of the present invention. Applicant notes that this portion of the 
reference teaches a technique for selecting preferred pronunciation rules 
(see col. 24, lines 54-59), where the pronunciation rules may embody a 
particular expressive style. In this case, the speaking style appears to be 
selected by a certified Lessac practitioner (see col. 23, lines 48-51), but it 
is unclear as to how the speaking style is selected by the practitioner. 
Moreover, once a final rule selection occurs, artificial intelligence 
processing is used to select a suitable pronunciation rule from amongst 
the final rule set which may then be applied to input text. Again, it is 
unclear as to how the artificial intelligence processing selects a suitable 
rule. At best, the selection appears to be based on the likely listener (see 
col. 23, lines 16-19). Therefore, Applicant respectfully asserts that Addison 
fails to teach or suggest selecting a speaking style base on an identified 
topic of the input text. (Italics added) 



As stated in the rejection, Addison teaches "the artificial intelligence 
assessment of the text that suggests the style that is appropriate to the content 
of the message to be conveyed and which, in turn is used to drive the acoustic 
profile of rhythm, tone change, ... for the words in the text to be synthesized." 
(col. 24, lines 14-20; Fig. 2, e.g. item 114; described in col. 13, lines 45-52 as 
artificial intelligence processing). Furthermore, Addison teaches the use of "rules 
which look to such features in the text as the identity of the speaker, ... and the 
nature of the text [topic]" (col. 11, lines 49-67, e.g. "the text relates to the sea" is 
a determination of topic). Thus, the examiner maintains that Addison teaches 
"selecting a speaking style based on an identified topic of the input text." 



6. Applicant asserts on page 7: 

Applicant's invention is likewise directed to a method for generating 
synthesized speech. However, Applicant's claimed invention recites 
"selecting a speaking style from a plurality of predefined speaking styles 
based on the identified topic, where each speaking style correlates to 
prosodic parameters" in combination with other elements of the claims. 
Independent Claims 1 and 5 have been amended to more clearly define 
this aspect of the present invention. Since Addison fails to disclose this 
aspect of the present invention, it is respectfully submitted that Applicant's 
claimed invention defines patentable subject matter over Addison. 
Accordingly, Applicant respectfully requests reconsideration and 
withdrawal of this rejection. (Italics added) 

See arguments in §5, above, concerning the determination of topic. 
Furthermore, Addison gives an example where the artificial intelligence is used to 
determine whether the text indicates that the speaker is slow and methodical, or rapid 
(col. 11, lines 60-65), which the examiner interprets as selecting a speaking style [slow, 
...etc.] based on an identified topic. 

Conclusion 

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time 
policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the mailing date of this final action. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to V. Paul Harper, whose telephone number is 
(571) 272-7605. The examiner can normally be reached on M-F. 



Application/Control Number: 10/083,839 




If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is 
571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



9/15/05 




SUPERVISORY PATENT EXAMINER 



V. Paul Harper 
Patent Examiner 
Art Unit 2654 




